Information gain is an information measure used in decision tree building algorithms, notably ID3. Given a classification set C and a feature or decision formula that creates a grouping set G, the information gain is
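$$\mathrm{gain}(C,G) = \sum_{c,g} p_{c,g} \log_2 \frac{p_{c,g}}{p_g} - \sum_{c} p_c \log_2 p_c$$
where $p_c$ is the frequency of items of class c in the data, $p_g$ is the frequency of items in grouping g, and $p_{c,g}$ is the frequency of items that are both of class c and in grouping g. Equivalently, the gain is the entropy of the classes minus the entropy that remains once the grouping is known, H(C) − H(C|G), so it is never negative and is zero when the grouping tells you nothing about the class.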
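As a rough illustration, here is a minimal Python sketch of the standard entropy-based calculation: the entropy of the class labels minus the average entropy within each group, weighted by group size. The function names `entropy` and `information_gain` and the toy data are assumptions made for this example, not part of ID3 itself or of any library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of labels."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())


def information_gain(classes, groups):
    """Entropy of the classes minus their entropy within each group.

    `classes` and `groups` are parallel sequences: item i has class
    classes[i] and falls into grouping groups[i].
    """
    n = len(classes)
    # Collect the class labels that fall into each grouping.
    by_group = {}
    for c, g in zip(classes, groups):
        by_group.setdefault(g, []).append(c)
    # Weight each group's entropy by the fraction of items it holds.
    remaining = sum(len(m) / n * entropy(m) for m in by_group.values())
    return entropy(classes) - remaining


# Hypothetical toy data: four items, two classes.
classes = ["yes", "yes", "no", "no"]
perfect_split = ["a", "a", "b", "b"]   # grouping separates the classes
useless_split = ["a", "b", "a", "b"]   # grouping says nothing about class

print(information_gain(classes, perfect_split))
print(information_gain(classes, useless_split))
```

With this toy data the first grouping recovers the full one bit of class entropy, while the second yields no gain. A tree builder in the style of ID3 would compare such values across candidate features and split on the one with the highest gain.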
Used in glossary entries: decision tree, ID3
Links:
Wikipedia: Information gain (decision tree)
link.springer.com: Induction of Decision Trees (Quinlan)
Wikipedia: Mutual_information – Variations